eye movement data
Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
We propose to jointly analyze experts' eye movements and verbal narrations to discover important and interpretable knowledge patterns that shed light on their decision-making processes. The discovered patterns can further enhance data-driven statistical models by fusing experts' domain knowledge to support complex human-machine collaborative decision-making. Our key contribution is a novel dynamic Bayesian nonparametric model that assigns latent knowledge patterns to the key phases involved in complex decision-making. Each phase is characterized by a unique distribution of word topics discovered from verbal narrations and their dynamic interactions with eye movement patterns, indicating experts' distinctive perceptual behavior within a given decision-making stage. A new split-merge-switch sampler is developed to efficiently explore the posterior state space with an improved mixing rate. Case studies on diagnostic error prediction and disease morphology categorization demonstrate the effectiveness of the proposed model and the discovered knowledge patterns.
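The abstract does not spell out the sampler, but the key ingredient of any split-merge scheme is a Metropolis-Hastings acceptance ratio comparing the merged and split configurations. Below is a minimal sketch of that ratio for a *merge* move in a much simpler stand-in model, a Dirichlet-process mixture of 1-D Gaussians with known variance; the model, hyperparameters, and omission of the proposal-density correction are all assumptions for illustration, not the authors' method.

```python
import numpy as np
from scipy.special import gammaln

SIGMA2 = 1.0   # known observation variance (assumed for this sketch)
TAU2 = 4.0     # prior variance of a cluster mean (assumed)
MU0 = 0.0      # prior mean of a cluster mean (assumed)
ALPHA = 1.0    # Dirichlet-process concentration (assumed)

def log_marginal(x):
    """Log marginal likelihood of one cluster's data:
    x_i ~ N(mu, SIGMA2), mu ~ N(MU0, TAU2), with mu integrated out."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - MU0
    quad = (d @ d - (TAU2 / (SIGMA2 + n * TAU2)) * d.sum() ** 2) / SIGMA2
    return -0.5 * (n * np.log(2 * np.pi * SIGMA2)
                   + np.log1p(n * TAU2 / SIGMA2) + quad)

def log_merge_ratio(x1, x2):
    """Log target ratio p(merged)/p(split): marginal-likelihood ratio
    times the CRP prior ratio Gamma(n1+n2) / (alpha Gamma(n1) Gamma(n2)).
    A full sampler would also include the split/merge proposal densities."""
    n1, n2 = len(x1), len(x2)
    lik = (log_marginal(np.concatenate([x1, x2]))
           - log_marginal(x1) - log_marginal(x2))
    prior = gammaln(n1 + n2) - gammaln(n1) - gammaln(n2) - np.log(ALPHA)
    return lik + prior

rng = np.random.default_rng(0)
a = rng.normal(-2.0, 1.0, size=30)   # two well-separated clusters:
b = rng.normal(+2.0, 1.0, size=30)   # merging them should be unattractive
print(log_merge_ratio(a, b))         # strongly negative -> merge rejected
c = rng.normal(-2.0, 1.0, size=30)   # same component as `a`:
print(log_merge_ratio(a, c))         # positive -> merge favoured
```

The "switch" move of the paper's sampler would additionally reassign patterns across phases; that part is not sketched here.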
Task Decoding based on Eye Movements using Synthetic Data Augmentation
Sadhu, Shanmuka, Baran, Arca, Pandey, Preeti, Kumar, Ayush
Machine learning has been used extensively in applications related to eye-tracking research. Understanding eye movements is one of the most significant subsets of eye-tracking research, as it reveals an individual's scanning pattern. Researchers have thoroughly analyzed eye movement data to study various eye-tracking applications, such as attention mechanisms, navigational behavior, and task understanding. Traditional machine learning algorithms used to decode tasks from eye movement data have produced mixed reactions to Yarbus' claim that it is possible to decode the observer's task from their eye movements. In this paper, to support Yarbus' hypothesis, we decode task categories while generating synthetic data samples with the well-known synthetic data generator CTGAN and its variants, such as CopulaGAN and the Gretel AI synthetic data generator, on data available from an in-person user study. Our results show that augmenting real eye movement data with additional synthetically generated samples improves classification accuracy even with traditional machine learning algorithms. We see a significant improvement in task decoding accuracy, from 28.1% using Random Forest to 82% using InceptionTime, when five times more data is added to the 320 real eye movement samples. Our proposed framework outperforms all available studies on this dataset because of the use of additional synthetic data. We validated our claim with various algorithms and combinations of real and synthetic data, showing how decoding accuracy increases as more generated data is added to the real data.
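A minimal sketch of this augment-then-classify pipeline, using the open-source SDV library's CTGAN synthesizer. The feature names and toy data below are placeholders, not the study's actual features or dataset; only held-out real data is used for evaluation.

```python
# pip install sdv scikit-learn
import numpy as np
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder stand-in for the 320-sample eye movement dataset:
# a few aggregate scanpath features plus a task label.
rng = np.random.default_rng(0)
n = 320
real = pd.DataFrame({
    "fix_dur_mean": rng.normal(220, 40, n),    # mean fixation duration (ms)
    "sacc_amp_mean": rng.normal(4.5, 1.2, n),  # mean saccade amplitude (deg)
    "fix_count": rng.integers(20, 120, n),
    "task": rng.choice(["free", "search", "memory", "count"], n),
})

train, test = train_test_split(real, test_size=0.25,
                               stratify=real["task"], random_state=0)

# Fit CTGAN on the real training split (label column included, so
# synthetic rows come with task labels) and sample 5x more data.
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(train)
synth = CTGANSynthesizer(metadata, epochs=100)
synth.fit(train)
fake = synth.sample(num_rows=5 * len(train))

augmented = pd.concat([train, fake], ignore_index=True)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(augmented.drop(columns="task"), augmented["task"])

# Evaluate on held-out *real* data only.
pred = clf.predict(test.drop(columns="task"))
print("accuracy on real test split:", accuracy_score(test["task"], pred))
```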
- North America > United States > New Jersey (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Information Technology (0.47)
- Health & Medicine (0.46)
Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
Weaknesses: I would have liked to see more examples in the discussion of the topics that were detected. It would be helpful if, in Table 1 and other similar illustrations, the different topics that the colored words correspond to were explicitly indicated. In the supplementary material, the table showing topics (Table 4) is useful, but I am curious to understand more about the links between the words in each topic category. Regarding baselines, I realize that in multimodal problems, especially those using modalities that are infrequently employed (e.g., eye tracking), it is difficult to find state-of-the-art models that are appropriate. So this is not a major criticism, but it does feel that the justification of the chosen baselines could be expanded.
Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
This paper has a lot of content: an interesting cognitive science question of modelling human decision-making, data fusion of texts and eye movements, modelled with a new dynamic Bayesian nonparametric model, and a new sampler for the model. The paper received an unusual amount of attention: five reviews, which were needed because it makes several different kinds of contributions. Hence it is not a stereotypical good conference paper with one neat idea and convincing theoretical or empirical support for it. Reviewers discussed the paper intensively, concluding that it is likely to be of interest at NeurIPS, and since there is no easy fix to make it more suitable to the format, such as dividing it into two papers, it is good enough to be accepted, though not among the best papers. Clarity can easily be improved by the authors, and additional details added to both the paper and the supplement.
- Information Technology > Artificial Intelligence > Cognitive Science (0.75)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
EMTeC: A Corpus of Eye Movements on Machine-Generated Texts
Bolliger, Lena Sophia, Haller, Patrick, Cretton, Isabelle Caroline Rose, Reich, David Robert, Kew, Tannon, Jäger, Lena Ann
The Eye Movements on Machine-Generated Texts Corpus (EMTeC) is a naturalistic eye-movements-while-reading corpus of 107 native English speakers reading machine-generated texts. The texts are generated by three large language models using five different decoding strategies, and they fall into six different text type categories. EMTeC includes the eye movement data at all stages of pre-processing, i.e., the raw coordinate data sampled at 2000 Hz, the fixation sequences, and the reading measures. It further provides both the original and a corrected version of the fixation sequences, accounting for vertical calibration drift. Moreover, the corpus includes the language models' internals that underlie the generation of the stimulus texts: the transition scores, the attention scores, and the hidden states. The stimuli are annotated for a range of linguistic features at both the text and word level. We anticipate EMTeC to be utilized for a variety of use cases such as, but not restricted to, the investigation of reading behavior on machine-generated text and the impact of different decoding strategies; reading behavior on different text types; the development of new pre-processing, data filtering, and drift correction algorithms; the cognitive interpretability and enhancement of language models; and the assessment of the predictive power of surprisal and entropy for human reading times. The data at all stages of pre-processing, the model internals, and the code to reproduce the stimulus generation, data pre-processing and analyses can be accessed via https://github.com/DiLi-Lab/EMTeC/.
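A hypothetical sketch of a first analysis on such a corpus: aggregating a word-level reading measure by text type and decoding strategy. The file and column names below are assumptions; consult the repository's README for the corpus's actual layout.

```python
import pandas as pd

# Hypothetical file and column names -- see https://github.com/DiLi-Lab/EMTeC/
# for the real reading-measure files and their schemas.
rm = pd.read_csv("reading_measures.csv")

# Mean gaze duration per text type and decoding strategy: a typical first
# look at whether generation settings modulate reading behaviour.
summary = (rm.groupby(["text_type", "decoding_strategy"])["gaze_duration"]
             .agg(["mean", "std", "count"])
             .sort_values("mean"))
print(summary)
```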
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (24 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education (1.00)
- Government > Regional Government > North America Government > United States Government (0.92)
Modeling Human Eye Movements with Neural Networks in a Maze-Solving Task
Li, Jason, Watters, Nicholas, Wang, Yingting, Sohn, Hansem, Jazayeri, Mehrdad
From smoothly pursuing moving objects to rapidly shifting gazes during visual search, humans employ a wide variety of eye movement strategies in different contexts. While eye movements provide a rich window into mental processes, building generative models of eye movements is notoriously difficult, and to date the computational objectives guiding eye movements remain largely a mystery. In this work, we tackled these problems in the context of a canonical spatial planning task, maze-solving. We collected eye movement data from human subjects and built deep generative models of eye movements using a novel differentiable architecture for gaze fixations and gaze shifts. We found that human eye movements are best predicted by a model that is optimized not to perform the task as efficiently as possible but instead to run an internal simulation of an object traversing the maze. This not only provides a generative model of eye movements in this task but also suggests a computational theory for how humans solve the task, namely that humans use mental simulation.
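The paper's differentiable gaze architecture is not reproduced here; as a toy illustration of the "internal simulation" hypothesis, the sketch below generates a simulated gaze trajectory by stepping an object along the maze's solution path (plain BFS) and scores human fixations by their distance to that trajectory. The maze, fixations, and scoring rule are all invented for illustration, not the authors' model.

```python
from collections import deque
import numpy as np

def solve_maze(grid, start, goal):
    """BFS shortest path through a 0/1 grid (1 = wall)."""
    h, w = grid.shape
    prev, q = {start: None}, deque([start])
    while q:
        r, c = q.popleft()
        if (r, c) == goal:
            break
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and grid[nr, nc] == 0 \
                    and (nr, nc) not in prev:
                prev[(nr, nc)] = (r, c)
                q.append((nr, nc))
    path, node = [], goal
    while node is not None:   # walk back from goal to start
        path.append(node)
        node = prev[node]
    return path[::-1]

grid = np.array([[0, 0, 0, 1],
                 [1, 1, 0, 1],
                 [0, 0, 0, 0],
                 [0, 1, 1, 0]])
sim_gaze = np.array(solve_maze(grid, (0, 0), (3, 3)), dtype=float)

# Toy "human" fixations; score = mean distance to the nearest simulated
# gaze point (lower = better fit of the mental-simulation account).
fixations = np.array([[0.1, 0.2], [0.2, 1.8], [1.9, 2.1],
                      [2.0, 3.1], [3.2, 2.9]])
d = np.linalg.norm(fixations[:, None, :] - sim_gaze[None, :, :], axis=-1)
print("mean nearest-point distance:", d.min(axis=1).mean())
```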
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Netherlands > Drenthe > Assen (0.04)
Latent gaze information in highly dynamic decision-tasks
Digitization is penetrating more and more areas of life. Tasks are increasingly completed digitally, and are therefore fulfilled not only faster and more efficiently but also more purposefully and successfully. The rapid developments in the field of artificial intelligence in recent years have played a major role in this, as they have produced many helpful approaches to build on. At the same time, the eyes, their movements, and the meaning of these movements are being researched more and more thoroughly. The combination of these developments has led to exciting approaches. In this dissertation, I present some of the approaches I worked on during my Ph.D. First, I provide insight into the development of models that use artificial intelligence to connect eye movements with visual expertise. This is demonstrated for two domains, or rather two groups of people: athletes in decision-making actions and surgeons in arthroscopic procedures. The resulting models can be considered digital diagnostic models for automatic expertise recognition. Furthermore, I show approaches that investigate the transferability of eye movement patterns to different expertise domains and, subsequently, important aspects of techniques for generalization. Finally, I address the temporal detection of confusion based on eye movement data. The results suggest using the resulting model as a clock signal for possible digital assistance options in the training of young professionals. An interesting aspect of my research is that I was able to draw on very valuable data from DFB youth elite athletes as well as from long-standing experts in arthroscopy. In particular, the work with the DFB data attracted the interest of radio and print media, namely DeutschlandFunk Nova and SWR DasDing. All resulting articles presented here have been published in internationally renowned journals or at conferences.
- North America > United States > California > Los Angeles County > Los Angeles (0.13)
- North America > Canada > Ontario > Toronto (0.13)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Instructional Material (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology (1.00)
- (9 more...)
Task Classification Model for Visual Fixation, Exploration, and Search
Kumar, Ayush, Tyagi, Anjul, Burch, Michael, Weiskopf, Daniel, Mueller, Klaus
Yarbus' claim that the observer's task can be decoded from eye movements has received mixed reactions. In this paper, we support the hypothesis that it is possible to decode the task. We conducted an exploratory analysis of the dataset by projecting features and data points into a scatter plot to visualize the nuanced properties of each task. Following this analysis, we eliminated highly correlated features before training SVM and AdaBoost classifiers to predict the tasks from the filtered eye movement data. We achieve an accuracy of 95.4% on this task classification problem and hence support the hypothesis that task classification is possible from a user's eye movement data.
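A minimal sketch of the described pipeline: drop one feature of every highly correlated pair, then train SVM and AdaBoost classifiers. The placeholder data, feature names, and the 0.9 correlation threshold are assumptions, not the paper's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix with one deliberately redundant feature.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(300, 8)),
                 columns=[f"feat{i}" for i in range(8)])
X["feat8"] = X["feat0"] * 0.98 + rng.normal(scale=0.05, size=300)
y = rng.choice(["fixation", "exploration", "search"], 300)

# Drop one feature of every pair with |r| above the threshold.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
X_filtered = X.drop(columns=drop)
print("dropped:", drop)

for clf in (make_pipeline(StandardScaler(), SVC(kernel="rbf")),
            AdaBoostClassifier(n_estimators=200)):
    scores = cross_val_score(clf, X_filtered, y, cv=5)
    print(type(clf).__name__, scores.mean().round(3))
```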
- North America > United States > Colorado > Denver County > Denver (0.06)
- North America > United States > New York > Suffolk County > Stony Brook (0.05)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
- (3 more...)
Learning to Predict Readability Using Eye-Movement Data From Natives and Learners
González-Garduño, Ana V. (University of Copenhagen) | Søgaard, Anders (University of Copenhagen)
Readability assessment can improve the quality of assistive technologies aimed at language learners. Eye-tracking data has been used for both inducing and evaluating general-purpose NLP/AI models, and below we show that, unsurprisingly, gaze data from language learners can also improve multi-task readability assessment models. This is unsurprising, since the gaze data records the reading difficulties of the learners. Unfortunately, eye-tracking data from language learners is often much harder to obtain than eye-tracking data from native speakers. We therefore compare the performance of deep learning readability models that use native-speaker eye movement data to models using data from language learners. Somewhat surprisingly, we observe no significant drop in performance when replacing learner data with native-speaker data, making approaches that rely on native-speaker gaze information more scalable. In other words, our finding is that language learner difficulties can be efficiently estimated from native speakers, which suggests that, more generally, readily available gaze data can be used to improve educational NLP/AI models targeted at language learners.
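A minimal sketch of the multi-task idea: a shared encoder with a readability head and an auxiliary gaze-prediction head, trained jointly. The architecture, feature dimensions, random placeholder data, and the 0.5 auxiliary loss weight are all assumptions, not the paper's actual model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder data: 256 sentences, 32 text features each; targets are a
# readability score and a mean fixation duration (both invented here).
X = torch.randn(256, 32)
y_read = torch.randn(256, 1)
y_gaze = torch.randn(256, 1)

class MultiTaskReadability(nn.Module):
    """Shared encoder with two heads: readability and gaze prediction.
    The gaze head acts as an auxiliary task regularizing the encoder."""
    def __init__(self, d_in=32, d_hid=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU())
        self.read_head = nn.Linear(d_hid, 1)
        self.gaze_head = nn.Linear(d_hid, 1)

    def forward(self, x):
        h = self.encoder(x)
        return self.read_head(h), self.gaze_head(h)

model = MultiTaskReadability()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()

for epoch in range(200):
    opt.zero_grad()
    pred_read, pred_gaze = model(X)
    # Auxiliary gaze loss weighted at 0.5 (weight is an assumption).
    loss = mse(pred_read, y_read) + 0.5 * mse(pred_gaze, y_gaze)
    loss.backward()
    opt.step()
print("final loss:", loss.item())
```

Whether the gaze data comes from natives or learners only changes `y_gaze`; the paper's finding is that the two sources yield comparably useful auxiliary signals.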
- Atlantic Ocean > Mediterranean Sea (0.04)
- Africa (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)